XClean in Action: A Demonstration of Declarative XML Data Cleaning

Authors

  • Melanie Weis
  • Ioana Manolescu
Abstract

We demonstrate XClean, a data cleaning system specifically geared towards cleaning XML data. XClean’s approach is based on a set of cleaning operators. Users specify cleaning programs by combining operators in the declarative XClean/PL language; these programs are then compiled into XQuery. We plan to show XClean in action on several scenarios based on real-world data. A graphical user interface supports users in writing XClean/PL programs and guides them through the process of obtaining the clean data.
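
To make the idea of combining cleaning operators into a program more concrete, the following is a minimal Python sketch. It is not XClean/PL syntax and does not reflect XClean's implementation: the operator names `scrub` and `drop_duplicates`, the `run_program` helper, and the sample document are all hypothetical, and the compilation step to XQuery described in the abstract is not modeled here.

```python
import xml.etree.ElementTree as ET

# Hypothetical operators for illustration only; not part of XClean/PL.

def scrub(root, tag):
    """Normalize whitespace and capitalization in every <tag> element's text."""
    for el in root.iter(tag):
        if el.text:
            el.text = " ".join(el.text.split()).title()
    return root

def drop_duplicates(root, record_tag, key_tag):
    """Remove direct child records whose key text repeats an earlier record."""
    seen = set()
    for rec in list(root.findall(record_tag)):
        key = rec.findtext(key_tag)
        if key in seen:
            root.remove(rec)
        else:
            seen.add(key)
    return root

def run_program(root, steps):
    """Apply each (operator, arguments) step in order, as a declarative list."""
    for op, args in steps:
        root = op(root, *args)
    return root

# Made-up sample data with a formatting error and a duplicate entity.
DIRTY = (
    "<people>"
    "<person><name>  ada   lovelace</name></person>"
    "<person><name>Ada Lovelace</name></person>"
    "<person><name>alan turing</name></person>"
    "</people>"
)

# The "program" is just data: an ordered list of operators and their arguments.
program = [
    (scrub, ("name",)),
    (drop_duplicates, ("person", "name")),
]

clean = run_program(ET.fromstring(DIRTY), program)
names = [p.findtext("name") for p in clean.findall("person")]
print(names)
```

The point of the sketch is only that a cleaning program can be expressed as an ordered composition of reusable operators, with scrubbing applied first so that duplicate detection compares normalized values.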


Similar resources

XClean in Action (Demo)

We demonstrate XClean, a data cleaning system specifically geared towards cleaning XML data. XClean’s approach is based on a set of cleaning operators. Users may specify cleaning programs by combining operators using the declarative XClean/PL language, which is then compiled into XQuery. We plan to show XClean in action on several scenarios based on real-world data. A graphical user interface s...


Declarative XML Data Cleaning with XClean

Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a crucial task in customer relationship management, data mining, and data integration. With the growing amount of XML data, approaches to effectively and efficiently clean XML are needed, an issue not addressed by existing ...


Duplicate detection in XML data

Duplicate detection consists of detecting multiple representations of the same real-world object, for every object represented in a data source. Duplicate detection is relevant in data cleaning and data integration applications and has been studied extensively for relational data describing a single type of object in a single table. Our research focuses on iterative duplicate detection i...


ARKTOS: A Tool For Data Cleaning and Transformation in Data Warehouse Environments

Extraction-Transformation-Loading (ETL) and Data Cleaning tools are pieces of software responsible for the extraction of data from several sources, their cleaning, customization and insertion into a data warehouse. To deal with the complexity and efficiency of the transformation and cleaning tasks we have developed a tool, namely ARKTOS, capable of modeling and executing practical scenarios, by...


Apply Uncertainty in Document-Oriented Database (MongoDB) Using F-XML

As we move to a big data world in which unstructured data grows at high velocity, a data store is needed to hold this large volume of data. Traditionally, relational databases have been used, but they are not well suited to handling this amount of data, so it is necessary to move to non-relational data stores. In the current study, we have proposed an extension of the Mongo...




Publication date: 2006